Approximate statistical alignment by iterative sampling of substitution matrices
Abstract
We outline a procedure for jointly sampling substitution matrices and multiple sequence alignments, according to an approximate posterior distribution, using an MCMC-based algorithm. This procedure provides an efficient and simple method by which to generate alternative alignments according to their expected accuracy, and allows appropriate parameters for substitution matrices to be selected in an automated fashion. In the cases considered here, the sampled alignments with the highest likelihood have an accuracy consistently higher than alignments generated using the standard BLOSUM62 matrix.
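As a rough illustration only, the sketch below shows one conditional step that a joint sampler of this kind might take: drawing a substitution matrix given a fixed alignment. It is not the authors' algorithm. The Dirichlet posterior over aligned-pair counts, the pseudocount, the toy nucleotide alphabet, and all function names are assumptions made for the example, and the complementary step (resampling the alignment given the matrix) is omitted.

```python
import math
import random

ALPHABET = "ACGT"  # toy alphabet (assumption); real use would cover amino acids


def count_pairs(alignment):
    """Count unordered residue pairs in every column, skipping gaps."""
    counts = {(a, b): 0 for a in ALPHABET for b in ALPHABET if a <= b}
    for col in range(len(alignment[0])):
        column = [seq[col] for seq in alignment]
        for i in range(len(column)):
            for j in range(i + 1, len(column)):
                x, y = column[i], column[j]
                if x == "-" or y == "-":
                    continue
                key = (x, y) if x <= y else (y, x)
                counts[key] += 1
    return counts


def sample_pair_probs(counts, pseudocount=1.0, rng=random):
    """Draw pair probabilities from Dirichlet(counts + pseudocount).

    A Dirichlet draw is obtained by normalising independent gamma draws.
    """
    draws = {k: rng.gammavariate(c + pseudocount, 1.0) for k, c in counts.items()}
    total = sum(draws.values())
    return {k: v / total for k, v in draws.items()}


def log_odds_matrix(pair_probs):
    """Turn sampled pair probabilities into symmetric log-odds scores (bits)."""
    # Residue frequencies implied by the sampled pair probabilities.
    freq = {a: 0.0 for a in ALPHABET}
    for (a, b), p in pair_probs.items():
        freq[a] += p / 2
        freq[b] += p / 2
    scores = {}
    for (a, b), p in pair_probs.items():
        # Probability of the unordered pair if residues were independent.
        expected = freq[a] * freq[b] * (1 if a == b else 2)
        scores[(a, b)] = scores[(b, a)] = math.log2(p / expected)
    return scores


# Hypothetical three-sequence alignment; "-" marks a gap.
alignment = ["ACGT-", "ACGA-", "AC-TT"]
counts = count_pairs(alignment)
matrix = log_odds_matrix(sample_pair_probs(counts, rng=random.Random(0)))
```

Repeating the draw yields a different matrix each time, which is the sense in which matrices are sampled rather than fixed in advance, as they are with BLOSUM62.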
Similar resources
Optimizing substitution matrices by separating score distributions
MOTIVATION: Homology search is one of the most fundamental tools in Bioinformatics. Typical alignment algorithms use substitution matrices and gap costs. Thus, the improvement of substitution matrices increases accuracy of homology searches. Generally, substitution matrices are derived from aligned sequences whose relationships are known, and gap costs are determined by trial and error. To discr...
Substitution Matrices and Mutual Information Approaches to Modeling Evolution
Substitution matrices are at the heart of Bioinformatics: sequence alignment, database search, phylogenetic inference, protein family classification are based on Blosum, Pam, JTT, mtREV24 and other matrices. These matrices provide means of computing models of evolution and assessing the statistical relationships amongst sequences. This paper reports two results; first we show how Bayesian and grid...
A Transition Probability Model for Amino Acid Substitutions from Blocks
Substitution matrices have been useful for sequence alignment and protein sequence comparisons. The BLOSUM series of matrices, which had been derived from a database of alignments of protein blocks, improved the accuracy of alignments previously obtained from the PAM-type matrices estimated from only closely related sequences. Although BLOSUM matrices are scoring matrices now widely used for pr...
Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments.
The relative performances of four strategies for aligning a large number of protein sequences were assessed by referring to corresponding structural alignments of 54 independent families. Multiple sequence alignment of a family was constructed by a given method from the sequences of known structures and their homologues, and the subset consisting of the sequences of known structures was extract...
Gauss-Sidel and Successive Over Relaxation Iterative Methods for Solving System of Fuzzy Sylvester Equations
In this paper, we present Gauss-Sidel and successive over relaxation (SOR) iterative methods for finding the approximate solution system of fuzzy Sylvester equations (SFSE), AX + XB = C, where A and B are two m*m crisp matrices, C is an m*m fuzzy matrix and X is an m*m unknown matrix. Finally, the proposed iterative methods are illustrated by solving one example.
Journal:
- CoRR
Volume abs/1501.04986
Publication date 2015